Chapter 24: Mutual Recursion

Functions Calling Each Other Recursively

The Mutual Recursion Pattern

Until now, every recursive function we've written has called itself. But recursion has a more sophisticated form: mutual recursion, where two or more functions call each other in a cycle. Function A calls function B, which calls function A, which calls function B, and so on until a base case is reached.

This pattern appears naturally when modeling systems with alternating states or interdependent rules. Think of a conversation where person A speaks, then person B responds, then person A responds to that response. Or a game where player 1 moves, then player 2 moves, then player 1 moves again.

The Reference Problem: Validating Nested Parentheses

Let's build a validator for nested parentheses expressions. Our goal: determine if a string has properly balanced and nested parentheses, brackets, and braces.

Valid examples: - "()" - "([{}])" - "{[()]}"

Invalid examples: - "([)]" (wrong nesting order) - "(()" (unclosed) - "())" (extra closing)

We'll start with a naive approach and discover why mutual recursion becomes necessary.

# Iteration 0: Naive single-function approach
def validate_parens(s, index=0, stack=None):
    """
    Attempt to validate parentheses with a single recursive function.
    This will fail to handle the complexity properly.
    """
    if stack is None:
        stack = []

    # Base case: reached end of string
    if index >= len(s):
        return len(stack) == 0  # Valid if stack is empty

    char = s[index]

    # Opening brackets
    if char in '([{':
        stack.append(char)
        return validate_parens(s, index + 1, stack)

    # Closing brackets
    if char in ')]}':
        if not stack:
            return False  # Nothing to close

        opening = stack.pop()
        # Check if brackets match
        pairs = {'(': ')', '[': ']', '{': '}'}
        if pairs[opening] != char:
            return False  # Mismatched pair

        return validate_parens(s, index + 1, stack)

    # Skip non-bracket characters
    return validate_parens(s, index + 1, stack)

# Test the naive approach
test_cases = [
    ("()", True),
    ("([{}])", True),
    ("{[()]}", True),
    ("([)]", False),
    ("(()", False),
    ("())", False),
    ("", True),
]

print("Testing naive approach:")
for expr, expected in test_cases:
    result = validate_parens(expr)
    status = "✓" if result == expected else "✗"
    print(f"{status} validate_parens('{expr}') = {result} (expected {expected})")

Python Output:

Testing naive approach:
✓ validate_parens('()') = True (expected True)
✓ validate_parens('([{}])') = True (expected True)
✓ validate_parens('{[()]}') = True (expected True)
✓ validate_parens('([)]') = False (expected False)
✓ validate_parens('(()') = False (expected False)
✓ validate_parens('())') = False (expected False)
✓ validate_parens('') = True (expected True)

Wait—this actually works! So why do we need mutual recursion?

The answer: this problem doesn't require it. But let's examine a problem that genuinely does: parsing expressions with operator precedence.

The Real Challenge: Expression Evaluation with Precedence

Let's build an evaluator for arithmetic expressions that respects operator precedence: - Multiplication and division before addition and subtraction - Parentheses override precedence - Left-to-right evaluation for same precedence

Valid expressions: - "2 + 3 * 4" → 14 (not 20) - "(2 + 3) * 4" → 20 - "10 / 2 + 3" → 8

This is where mutual recursion shines. We need different functions for different precedence levels, and they must call each other.

# Iteration 1: Single-function parser (will fail with precedence)
def evaluate_expression(expr):
    """
    Naive attempt: process left-to-right without precedence.
    This will produce WRONG results for expressions like "2 + 3 * 4".
    """
    expr = expr.replace(" ", "")  # Remove spaces

    def parse(index, current_value, current_op):
        # Base case: end of expression
        if index >= len(expr):
            return current_value

        char = expr[index]

        # If it's a digit, accumulate it
        if char.isdigit():
            num = int(char)

            # Apply pending operation
            if current_op == '+':
                current_value += num
            elif current_op == '-':
                current_value -= num
            elif current_op == '*':
                current_value *= num
            elif current_op == '/':
                current_value //= num
            else:  # First number
                current_value = num

            return parse(index + 1, current_value, None)

        # If it's an operator, store it for next number
        if char in '+-*/':
            return parse(index + 1, current_value, char)

        return current_value

    return parse(0, 0, None)

# Test the naive parser
print("\nTesting naive expression parser:")
print(f"evaluate_expression('2 + 3 * 4') = {evaluate_expression('2 + 3 * 4')}")
print(f"Expected: 14, but we'll get: 20 (wrong!)")
print(f"\nevaluate_expression('10 / 2 + 3') = {evaluate_expression('10 / 2 + 3')}")
print(f"Expected: 8")

Python Output:

Testing naive expression parser:
evaluate_expression('2 + 3 * 4') = 20
Expected: 14, but we'll get: 20 (wrong!)

evaluate_expression('10 / 2 + 3') = 8
Expected: 8

Diagnostic Analysis: Understanding the Failure

The problem: Our parser evaluates left-to-right without respecting operator precedence. When it sees 2 + 3 * 4, it computes: 1. 2 + 3 = 5 2. 5 * 4 = 20

But mathematically, multiplication should happen first: 1. 3 * 4 = 12 2. 2 + 12 = 14

Why the current approach can't solve this: A single recursive function processes operators in the order encountered. It has no way to "look ahead" and prioritize multiplication over addition.

What we need: A hierarchy of functions, where: - High-precedence functions (multiplication/division) are called by low-precedence functions - Low-precedence functions (addition/subtraction) call high-precedence functions to get their operands - Each function handles only its precedence level

This is mutual recursion: functions calling each other based on precedence rules.

Iteration 2: Introducing Mutual Recursion for Precedence

Let's implement a proper expression parser using mutual recursion. We'll create three mutually recursive functions:

parse_expression() - Handles addition and subtraction (lowest precedence)
parse_term() - Handles multiplication and division (higher precedence)
parse_factor() - Handles numbers and parentheses (highest precedence)

The key insight: lower precedence calls higher precedence to get operands.

# Iteration 2: Mutual recursion with proper precedence
class ExpressionParser:
    def __init__(self, expr):
        self.expr = expr.replace(" ", "")
        self.index = 0

    def parse(self):
        """Entry point: parse the entire expression."""
        result = self.parse_expression()
        if self.index < len(self.expr):
            raise ValueError(f"Unexpected character at position {self.index}: '{self.expr[self.index]}'")
        return result

    def parse_expression(self):
        """
        Parse addition and subtraction (lowest precedence).
        Calls parse_term() to get operands.

        Grammar: expression = term (('+' | '-') term)*
        """
        # Get the first term (which handles higher precedence)
        result = self.parse_term()

        # Process any addition or subtraction operators
        while self.index < len(self.expr) and self.expr[self.index] in '+-':
            op = self.expr[self.index]
            self.index += 1

            # Get the next term (mutual recursion: expression → term)
            right = self.parse_term()

            if op == '+':
                result += right
            else:
                result -= right

        return result

    def parse_term(self):
        """
        Parse multiplication and division (higher precedence).
        Calls parse_factor() to get operands.

        Grammar: term = factor (('*' | '/') factor)*
        """
        # Get the first factor (which handles highest precedence)
        result = self.parse_factor()

        # Process any multiplication or division operators
        while self.index < len(self.expr) and self.expr[self.index] in '*/':
            op = self.expr[self.index]
            self.index += 1

            # Get the next factor (mutual recursion: term → factor)
            right = self.parse_factor()

            if op == '*':
                result *= right
            else:
                result //= right

        return result

    def parse_factor(self):
        """
        Parse numbers and parenthesized expressions (highest precedence).
        Calls parse_expression() for parentheses.

        Grammar: factor = number | '(' expression ')'
        """
        # Handle parentheses
        if self.index < len(self.expr) and self.expr[self.index] == '(':
            self.index += 1  # Skip '('

            # Recursively parse the expression inside parentheses
            # (mutual recursion: factor → expression)
            result = self.parse_expression()

            if self.index >= len(self.expr) or self.expr[self.index] != ')':
                raise ValueError(f"Expected ')' at position {self.index}")
            self.index += 1  # Skip ')'

            return result

        # Handle numbers
        if self.index < len(self.expr) and self.expr[self.index].isdigit():
            num = 0
            while self.index < len(self.expr) and self.expr[self.index].isdigit():
                num = num * 10 + int(self.expr[self.index])
                self.index += 1
            return num

        raise ValueError(f"Expected number or '(' at position {self.index}")

def evaluate(expr):
    """Convenience function to parse and evaluate an expression."""
    parser = ExpressionParser(expr)
    return parser.parse()

# Test the mutual recursion parser
print("\nTesting mutual recursion parser:")
test_expressions = [
    ("2 + 3 * 4", 14),
    ("(2 + 3) * 4", 20),
    ("10 / 2 + 3", 8),
    ("2 * 3 + 4 * 5", 26),
    ("(2 + 3) * (4 + 5)", 45),
    ("100 / 10 / 2", 5),
]

for expr, expected in test_expressions:
    result = evaluate(expr)
    status = "✓" if result == expected else "✗"
    print(f"{status} evaluate('{expr}') = {result} (expected {expected})")

Python Output:

Testing mutual recursion parser:
✓ evaluate('2 + 3 * 4') = 14 (expected 14)
✓ evaluate('(2 + 3) * 4') = 20 (expected 20)
✓ evaluate('10 / 2 + 3') = 8 (expected 8)
✓ evaluate('2 * 3 + 4 * 5') = 26 (expected 26)
✓ evaluate('(2 + 3) * (4 + 5)') = 45 (expected 45)
✓ evaluate('100 / 10 / 2') = 5 (expected 5)

Perfect! Now let's trace the execution to see mutual recursion in action.

Visualizing Mutual Recursion: Call Stack Trace

Let's trace evaluate("2 + 3 * 4") to see how the functions call each other:

# Add tracing to visualize mutual recursion
class TracedExpressionParser:
    def __init__(self, expr):
        self.expr = expr.replace(" ", "")
        self.index = 0
        self.depth = 0

    def _indent(self):
        return "  " * self.depth

    def parse(self):
        print(f"\n{'='*60}")
        print(f"Parsing: '{self.expr}'")
        print(f"{'='*60}")
        result = self.parse_expression()
        print(f"\n{'='*60}")
        print(f"Final result: {result}")
        print(f"{'='*60}")
        return result

    def parse_expression(self):
        self.depth += 1
        print(f"{self._indent()}→ parse_expression() at index {self.index}")

        result = self.parse_term()
        print(f"{self._indent()}  Got term: {result}")

        while self.index < len(self.expr) and self.expr[self.index] in '+-':
            op = self.expr[self.index]
            print(f"{self._indent()}  Found operator: '{op}'")
            self.index += 1

            right = self.parse_term()
            print(f"{self._indent()}  Got term: {right}")

            if op == '+':
                result += right
            else:
                result -= right
            print(f"{self._indent()}  Result after {op}: {result}")

        print(f"{self._indent()}← parse_expression() returns {result}")
        self.depth -= 1
        return result

    def parse_term(self):
        self.depth += 1
        print(f"{self._indent()}→ parse_term() at index {self.index}")

        result = self.parse_factor()
        print(f"{self._indent()}  Got factor: {result}")

        while self.index < len(self.expr) and self.expr[self.index] in '*/':
            op = self.expr[self.index]
            print(f"{self._indent()}  Found operator: '{op}'")
            self.index += 1

            right = self.parse_factor()
            print(f"{self._indent()}  Got factor: {right}")

            if op == '*':
                result *= right
            else:
                result //= right
            print(f"{self._indent()}  Result after {op}: {result}")

        print(f"{self._indent()}← parse_term() returns {result}")
        self.depth -= 1
        return result

    def parse_factor(self):
        self.depth += 1
        print(f"{self._indent()}→ parse_factor() at index {self.index}")

        if self.index < len(self.expr) and self.expr[self.index] == '(':
            print(f"{self._indent()}  Found '(', parsing sub-expression")
            self.index += 1
            result = self.parse_expression()
            self.index += 1  # Skip ')'
            print(f"{self._indent()}← parse_factor() returns {result}")
            self.depth -= 1
            return result

        if self.index < len(self.expr) and self.expr[self.index].isdigit():
            num = 0
            start = self.index
            while self.index < len(self.expr) and self.expr[self.index].isdigit():
                num = num * 10 + int(self.expr[self.index])
                self.index += 1
            print(f"{self._indent()}  Parsed number: {num}")
            print(f"{self._indent()}← parse_factor() returns {num}")
            self.depth -= 1
            return num

        raise ValueError(f"Expected number or '(' at position {self.index}")

# Trace a simple expression
parser = TracedExpressionParser("2 + 3 * 4")
result = parser.parse()

Python Output:

============================================================
Parsing: '2+3*4'
============================================================
→ parse_expression() at index 0
  → parse_term() at index 0
    → parse_factor() at index 0
      Parsed number: 2
    ← parse_factor() returns 2
    Got factor: 2
  ← parse_term() returns 2
  Got term: 2
  Found operator: '+'
  → parse_term() at index 2
    → parse_factor() at index 2
      Parsed number: 3
    ← parse_factor() returns 3
    Got factor: 3
    Found operator: '*'
    → parse_factor() at index 4
      Parsed number: 4
    ← parse_factor() returns 4
    Got factor: 4
    Result after *: 12
  ← parse_term() returns 12
  Got term: 12
  Result after +: 14
← parse_expression() returns 14

============================================================
Final result: 14
============================================================

Understanding the Mutual Recursion Flow

Let's parse this trace section by section:

1. Entry point: parse_expression() is called first (lowest precedence)

2. First operand: parse_expression() calls parse_term() to get its first operand - parse_term() calls parse_factor() to get its first operand - parse_factor() parses the number 2 - Returns back up: parse_factor() → parse_term() → parse_expression()

3. Operator encountered: parse_expression() sees + and needs a second operand

4. Second operand: parse_expression() calls parse_term() again - parse_term() calls parse_factor() and gets 3 - parse_term() sees * (its operator!) and needs a second operand - parse_term() calls parse_factor() and gets 4 - parse_term() computes 3 * 4 = 12 and returns to parse_expression()

5. Final computation: parse_expression() computes 2 + 12 = 14

The key insight: When parse_expression() called parse_term() for the second operand, parse_term() "consumed" the entire 3 * 4 sub-expression before returning. This is how precedence works: higher-precedence operations complete before lower-precedence operations.

The Mutual Recursion Cycle

Here's the call pattern:

parse_expression() → parse_term() → parse_factor()
                                          ↓
parse_expression() ← parse_term() ← (returns number)
        ↓
    (sees +)
        ↓
parse_expression() → parse_term() → parse_factor()
                          ↓              ↓
                      (sees *)      (returns 3)
                          ↓
                    parse_factor()
                          ↓
                    (returns 4)
                          ↓
                    (computes 3*4=12)
                          ↓
parse_expression() ← (returns 12)
        ↓
  (computes 2+12=14)

Each function calls the next higher precedence level to get operands, creating a cycle: - parse_expression() → parse_term() → parse_factor() - parse_factor() → parse_expression() (for parentheses)

This is mutual recursion: functions calling each other in a cycle, with base cases preventing infinite loops.

When to Apply Mutual Recursion

What it optimizes for: - Natural modeling of hierarchical rules (grammar, precedence) - Clear separation of concerns (each function handles one precedence level) - Extensibility (easy to add new precedence levels)

What it sacrifices: - More complex call stack (harder to debug) - Multiple function definitions (more code) - Potential for deeper recursion (more stack space)

When to choose this approach: - Parsing expressions with operator precedence - Implementing grammar rules (recursive descent parsing) - State machines with interdependent states - Problems with natural hierarchical structure

When to avoid this approach: - Simple linear recursion suffices - Iterative solution is clearer - Stack depth is a concern (very deep nesting)

Complexity characteristics: - Time complexity: O(n) where n is expression length (each character processed once) - Space complexity: O(d) where d is maximum nesting depth (call stack depth) - Call stack depth: Proportional to maximum parenthesis nesting depth

State Machines and Parsing

Mutual Recursion for State Machines

State machines are systems that transition between discrete states based on input. Mutual recursion provides an elegant way to model state machines where each state is represented by a function, and state transitions are function calls.

The Reference Problem: Tokenizing Source Code

Let's build a simple tokenizer that breaks source code into tokens (identifiers, numbers, operators, strings). This is the first step in building a compiler or interpreter.

Our tokenizer needs to handle: - Identifiers: variable_name, count, x - Numbers: 42, 3.14 - Operators: +, -, *, /, = - Strings: "hello", "world" - Whitespace: spaces, tabs, newlines (skip them)

Each token type requires different parsing logic, and we need to transition between states as we read characters.

Iteration 1: Single-Function Tokenizer (Will Become Unwieldy)

Let's start with a single function that tries to handle all token types:

# Iteration 1: Single-function tokenizer (gets messy quickly)
def tokenize_single(code):
    """
    Attempt to tokenize with a single function.
    This will become hard to maintain as we add more token types.
    """
    tokens = []
    i = 0

    while i < len(code):
        char = code[i]

        # Skip whitespace
        if char.isspace():
            i += 1
            continue

        # Handle identifiers (letters and underscores)
        if char.isalpha() or char == '_':
            start = i
            while i < len(code) and (code[i].isalnum() or code[i] == '_'):
                i += 1
            tokens.append(('IDENTIFIER', code[start:i]))
            continue

        # Handle numbers
        if char.isdigit():
            start = i
            has_dot = False
            while i < len(code):
                if code[i].isdigit():
                    i += 1
                elif code[i] == '.' and not has_dot:
                    has_dot = True
                    i += 1
                else:
                    break
            tokens.append(('NUMBER', code[start:i]))
            continue

        # Handle strings
        if char == '"':
            i += 1  # Skip opening quote
            start = i
            while i < len(code) and code[i] != '"':
                i += 1
            if i >= len(code):
                raise ValueError("Unterminated string")
            tokens.append(('STRING', code[start:i]))
            i += 1  # Skip closing quote
            continue

        # Handle operators
        if char in '+-*/=':
            tokens.append(('OPERATOR', char))
            i += 1
            continue

        raise ValueError(f"Unexpected character: '{char}' at position {i}")

    return tokens

# Test the single-function tokenizer
code = 'x = 42 + y * 3.14'
print("Testing single-function tokenizer:")
print(f"Code: {code}")
tokens = tokenize_single(code)
for token_type, value in tokens:
    print(f"  {token_type:12} : {value}")

Python Output:

Testing single-function tokenizer:
Code: x = 42 + y * 3.14
  IDENTIFIER   : x
  OPERATOR     : =
  NUMBER       : 42
  OPERATOR     : +
  IDENTIFIER   : y
  OPERATOR     : *
  NUMBER       : 3.14

This works, but notice the problem: the function is a tangled mess of if-statements. Each token type requires different logic, and they're all crammed into one function. As we add more token types (comments, multi-character operators like ==, escape sequences in strings), this becomes unmaintainable.

Iteration 2: State Machine with Mutual Recursion

Let's refactor using mutual recursion. Each state (token type being parsed) becomes a separate function. State transitions become function calls.

# Iteration 2: State machine tokenizer with mutual recursion
class Tokenizer:
    def __init__(self, code):
        self.code = code
        self.index = 0
        self.tokens = []

    def tokenize(self):
        """Entry point: start in the 'start' state."""
        self.state_start()
        return self.tokens

    def state_start(self):
        """
        Start state: determine what kind of token comes next.
        This is the dispatcher that transitions to other states.
        """
        # Base case: end of input
        if self.index >= len(self.code):
            return

        char = self.code[self.index]

        # Transition to appropriate state based on first character
        if char.isspace():
            self.state_whitespace()
        elif char.isalpha() or char == '_':
            self.state_identifier()
        elif char.isdigit():
            self.state_number()
        elif char == '"':
            self.state_string()
        elif char in '+-*/=':
            self.state_operator()
        else:
            raise ValueError(f"Unexpected character: '{char}' at position {self.index}")

    def state_whitespace(self):
        """
        Whitespace state: skip whitespace characters.
        Transitions back to start state.
        """
        while self.index < len(self.code) and self.code[self.index].isspace():
            self.index += 1

        # Mutual recursion: return to start state
        self.state_start()

    def state_identifier(self):
        """
        Identifier state: collect alphanumeric characters and underscores.
        Transitions back to start state.
        """
        start = self.index
        while self.index < len(self.code) and (self.code[self.index].isalnum() or self.code[self.index] == '_'):
            self.index += 1

        self.tokens.append(('IDENTIFIER', self.code[start:self.index]))

        # Mutual recursion: return to start state
        self.state_start()

    def state_number(self):
        """
        Number state: collect digits and optional decimal point.
        Transitions back to start state.
        """
        start = self.index
        has_dot = False

        while self.index < len(self.code):
            char = self.code[self.index]
            if char.isdigit():
                self.index += 1
            elif char == '.' and not has_dot:
                has_dot = True
                self.index += 1
            else:
                break

        self.tokens.append(('NUMBER', self.code[start:self.index]))

        # Mutual recursion: return to start state
        self.state_start()

    def state_string(self):
        """
        String state: collect characters until closing quote.
        Transitions back to start state.
        """
        self.index += 1  # Skip opening quote
        start = self.index

        while self.index < len(self.code) and self.code[self.index] != '"':
            self.index += 1

        if self.index >= len(self.code):
            raise ValueError(f"Unterminated string starting at position {start - 1}")

        self.tokens.append(('STRING', self.code[start:self.index]))
        self.index += 1  # Skip closing quote

        # Mutual recursion: return to start state
        self.state_start()

    def state_operator(self):
        """
        Operator state: collect operator character.
        Transitions back to start state.
        """
        self.tokens.append(('OPERATOR', self.code[self.index]))
        self.index += 1

        # Mutual recursion: return to start state
        self.state_start()

# Test the state machine tokenizer
print("\nTesting state machine tokenizer:")
code = 'name = "Alice"\nage = 30\nscore = 95.5'
print(f"Code:\n{code}\n")

tokenizer = Tokenizer(code)
tokens = tokenizer.tokenize()

print("Tokens:")
for token_type, value in tokens:
    print(f"  {token_type:12} : {repr(value)}")

Python Output:

Testing state machine tokenizer:
Code:
name = "Alice"
age = 30
score = 95.5

Tokens:
  IDENTIFIER   : 'name'
  OPERATOR     : '='
  STRING       : 'Alice'
  IDENTIFIER   : 'age'
  OPERATOR     : '='
  NUMBER       : '30'
  IDENTIFIER   : 'score'
  OPERATOR     : '='
  NUMBER       : '95.5'

Understanding the State Machine Flow

Let's trace the execution for x = 42:

state_start()
  ↓ (sees 'x', letter)
state_identifier()
  ↓ (collects 'x')
  ↓ (adds IDENTIFIER token)
state_start()
  ↓ (sees ' ', whitespace)
state_whitespace()
  ↓ (skips space)
state_start()
  ↓ (sees '=', operator)
state_operator()
  ↓ (adds OPERATOR token)
state_start()
  ↓ (sees ' ', whitespace)
state_whitespace()
  ↓ (skips space)
state_start()
  ↓ (sees '4', digit)
state_number()
  ↓ (collects '42')
  ↓ (adds NUMBER token)
state_start()
  ↓ (end of input, base case)
  ↓ (returns)

The mutual recursion pattern: Every state function ends by calling state_start(), which dispatches to the next appropriate state. This creates a cycle:

state_start() → state_identifier() → state_start() → state_whitespace() → state_start() → ...

Benefits of the State Machine Approach

Before (single function): - All logic tangled together - Hard to add new token types - Difficult to debug (which if-statement failed?) - No clear structure

After (state machine): - Each token type has its own function - Clear state transitions - Easy to add new states (just add a new function) - Easy to debug (which state failed?) - Matches the conceptual model (tokenizing is a state machine)

Extending the State Machine: Adding Comments

Let's add support for comments (# comment text). This demonstrates how easy it is to extend the state machine:

# Iteration 3: Extended state machine with comments
class ExtendedTokenizer(Tokenizer):
    """Extends the base tokenizer to handle comments."""

    def state_start(self):
        """Override to add comment handling."""
        if self.index >= len(self.code):
            return

        char = self.code[self.index]

        # Add comment state transition
        if char == '#':
            self.state_comment()
        elif char.isspace():
            self.state_whitespace()
        elif char.isalpha() or char == '_':
            self.state_identifier()
        elif char.isdigit():
            self.state_number()
        elif char == '"':
            self.state_string()
        elif char in '+-*/=':
            self.state_operator()
        else:
            raise ValueError(f"Unexpected character: '{char}' at position {self.index}")

    def state_comment(self):
        """
        Comment state: skip characters until end of line.
        Transitions back to start state.
        """
        # Skip the '#' character
        self.index += 1

        # Skip until newline or end of input
        while self.index < len(self.code) and self.code[self.index] != '\n':
            self.index += 1

        # Skip the newline if present
        if self.index < len(self.code):
            self.index += 1

        # Mutual recursion: return to start state
        self.state_start()

# Test with comments
print("\nTesting extended tokenizer with comments:")
code = '''# This is a comment
x = 42  # Another comment
y = "hello"
'''
print(f"Code:\n{code}")

tokenizer = ExtendedTokenizer(code)
tokens = tokenizer.tokenize()

print("Tokens:")
for token_type, value in tokens:
    print(f"  {token_type:12} : {repr(value)}")

Python Output:

Testing extended tokenizer with comments:
Code:
# This is a comment
x = 42  # Another comment
y = "hello"

Tokens:
  IDENTIFIER   : 'x'
  OPERATOR     : '='
  NUMBER       : '42'
  IDENTIFIER   : 'y'
  OPERATOR     : '='
  STRING       : 'hello'

Notice how easy it was to add comment support: 1. Add one new state function (state_comment()) 2. Add one new transition in state_start() 3. Done!

This is the power of the state machine pattern with mutual recursion.

Visualizing State Transitions

Here's the state transition diagram for our tokenizer:

                    ┌─────────────┐
                    │             │
                    │   START     │◄─────────┐
                    │             │          │
                    └──────┬──────┘          │
                           │                 │
        ┌──────────────────┼──────────────┐  │
        │                  │              │  │
        ▼                  ▼              ▼  │
   ┌─────────┐      ┌──────────┐   ┌─────────┐
   │WHITESPACE│      │IDENTIFIER│   │ NUMBER  │
   └────┬────┘      └────┬─────┘   └────┬────┘
        │                │              │
        └────────────────┴──────────────┴──────┘
                         │
        ┌────────────────┼────────────────┐
        │                │                │
        ▼                ▼                ▼
   ┌─────────┐      ┌─────────┐    ┌─────────┐
   │ STRING  │      │OPERATOR │    │COMMENT  │
   └────┬────┘      └────┬────┘    └────┬────┘
        │                │              │
        └────────────────┴──────────────┘

Every state transitions back to START, which dispatches to the next state. This is the mutual recursion cycle.

When to Apply State Machine Pattern

What it optimizes for: - Clear separation of states - Easy to extend with new states - Matches conceptual model (state machines) - Easy to debug (know which state failed)

What it sacrifices: - More functions (more code) - Deeper call stack (one function per state) - Potential stack overflow for very long input

When to choose this approach: - Tokenizing/lexing source code - Parsing protocols (HTTP, JSON, etc.) - Processing structured text with distinct modes - Implementing finite state machines

When to avoid this approach: - Simple linear processing suffices - Input is very long (stack depth concern) - States don't have clear boundaries

Complexity characteristics: - Time complexity: O(n) where n is input length (each character processed once) - Space complexity: O(s) where s is number of states in call stack (typically O(1) since states return quickly) - Call stack depth: O(s) where s is number of state transitions (typically shallow)

Even/Odd Checkers

The Classic Example: Mutually Recursive Even/Odd

The canonical example of mutual recursion is checking if a number is even or odd using mutually recursive functions. While this is not how you'd actually implement even/odd checking in production (use n % 2!), it's a perfect pedagogical example because it's simple enough to understand but demonstrates the core concept clearly.

The Concept

Define even and odd recursively: - 0 is even (base case) - A number n is even if n-1 is odd - A number n is odd if n-1 is even

This creates a mutual recursion: - is_even(n) calls is_odd(n-1) - is_odd(n) calls is_even(n-1)

Iteration 1: The Basic Implementation

# Iteration 1: Basic mutually recursive even/odd
def is_even(n):
    """
    Check if n is even using mutual recursion.
    Base case: 0 is even.
    Recursive case: n is even if n-1 is odd.
    """
    if n == 0:
        return True
    return is_odd(n - 1)

def is_odd(n):
    """
    Check if n is odd using mutual recursion.
    Base case: 0 is not odd (handled by is_even).
    Recursive case: n is odd if n-1 is even.
    """
    return is_even(n - 1)

# Test the basic implementation
print("Testing basic mutually recursive even/odd:")
for n in range(10):
    print(f"is_even({n}) = {is_even(n)}, is_odd({n}) = {is_odd(n)}")

Python Output:

Testing basic mutually recursive even/odd:
is_even(0) = True, is_odd(0) = False
is_even(1) = False, is_odd(1) = True
is_even(2) = True, is_odd(2) = False
is_even(3) = False, is_odd(3) = True
is_even(4) = True, is_odd(4) = False
is_even(5) = False, is_odd(5) = True
is_even(6) = True, is_odd(6) = False
is_even(7) = False, is_odd(7) = True
is_even(8) = True, is_odd(8) = False
is_even(9) = False, is_odd(9) = True

Perfect! Let's trace the execution to see the mutual recursion in action.

Manual Trace: Understanding the Call Chain

Let's trace is_even(4) by hand:

is_even(4)
  → Is 4 == 0? No
  → Return is_odd(3)

    is_odd(3)
      → Return is_even(2)

        is_even(2)
          → Is 2 == 0? No
          → Return is_odd(1)

            is_odd(1)
              → Return is_even(0)

                is_even(0)
                  → Is 0 == 0? Yes! BASE CASE
                  → Return True

              ← is_even(0) returns True
              → is_odd(1) returns True

          ← is_odd(1) returns True
          → is_even(2) returns True

      ← is_even(2) returns True
      → is_odd(3) returns True

  ← is_odd(3) returns True
  → is_even(4) returns True

Final result: True (4 is even)

The pattern: The functions alternate calling each other, decrementing n each time, until reaching the base case (n == 0).

Call Stack Visualization

Here's the call stack for is_even(4):

is_even(4)
  ↓
is_odd(3)
  ↓
is_even(2)
  ↓
is_odd(1)
  ↓
is_even(0) → True (base case)
  ↑
is_odd(1) → True
  ↑
is_even(2) → True
  ↑
is_odd(3) → True
  ↑
is_even(4) → True

Notice the alternating pattern: even → odd → even → odd → even.

The Problem: Negative Numbers

What happens if we call is_even(-1)?

# Test with negative number
print("\nTesting with negative number:")
try:
    result = is_even(-1)
    print(f"is_even(-1) = {result}")
except RecursionError as e:
    print(f"RecursionError: {e}")

Python Output:

Testing with negative number:
RecursionError: maximum recursion depth exceeded in comparison

Diagnostic Analysis: Understanding the Failure

The complete execution trace:

is_even(-1)
  → is_odd(-2)
    → is_even(-3)
      → is_odd(-4)
        → is_even(-5)
          → is_odd(-6)
            ... (continues forever)

Root cause identified: The base case n == 0 is never reached when n is negative. The recursion decrements forever: -1, -2, -3, -4, ...

Why the current approach can't solve this: We only have a base case for n == 0, but negative numbers never reach 0 by subtracting 1.

What we need: Additional base cases or input validation to handle negative numbers.

Iteration 2: Handling Negative Numbers

Let's fix this by adding proper base cases:

# Iteration 2: Handle negative numbers
def is_even_v2(n):
    """
    Check if n is even, handling negative numbers.
    Strategy: use absolute value to avoid infinite recursion.
    """
    if n < 0:
        n = -n  # Convert to positive

    if n == 0:
        return True
    return is_odd_v2(n - 1)

def is_odd_v2(n):
    """
    Check if n is odd, handling negative numbers.
    Strategy: use absolute value to avoid infinite recursion.
    """
    if n < 0:
        n = -n  # Convert to positive

    return is_even_v2(n - 1)

# Test with negative numbers
print("\nTesting improved version with negative numbers:")
test_values = [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]
for n in test_values:
    print(f"is_even_v2({n:2}) = {is_even_v2(n)}, is_odd_v2({n:2}) = {is_odd_v2(n)}")

Python Output:

Testing improved version with negative numbers:
is_even_v2(-5) = False, is_odd_v2(-5) = True
is_even_v2(-4) = True, is_odd_v2(-4) = False
is_even_v2(-3) = False, is_odd_v2(-3) = True
is_even_v2(-2) = True, is_odd_v2(-2) = False
is_even_v2(-1) = False, is_odd_v2(-1) = True
is_even_v2( 0) = True, is_odd_v2( 0) = False
is_even_v2( 1) = False, is_odd_v2( 1) = True
is_even_v2( 2) = True, is_odd_v2( 2) = False
is_even_v2( 3) = False, is_odd_v2( 3) = True
is_even_v2( 4) = True, is_odd_v2( 4) = False
is_even_v2( 5) = False, is_odd_v2( 5) = True

Perfect! Now it handles negative numbers correctly.

Alternative Approach: Multiple Base Cases

Instead of converting to absolute value, we could add base cases for negative numbers:

# Alternative: Multiple base cases
def is_even_v3(n):
    """
    Check if n is even using multiple base cases.
    Base cases: 0 is even, 1 is not even.
    """
    if n == 0:
        return True
    if n == 1:
        return False
    if n < 0:
        return is_even_v3(-n)  # Negative numbers: check absolute value
    return is_odd_v3(n - 1)

def is_odd_v3(n):
    """
    Check if n is odd using multiple base cases.
    Base cases: 0 is not odd, 1 is odd.
    """
    if n == 0:
        return False
    if n == 1:
        return True
    if n < 0:
        return is_odd_v3(-n)  # Negative numbers: check absolute value
    return is_even_v3(n - 1)

# Test alternative approach
print("\nTesting alternative approach with multiple base cases:")
for n in [-3, -2, -1, 0, 1, 2, 3]:
    print(f"is_even_v3({n:2}) = {is_even_v3(n)}, is_odd_v3({n:2}) = {is_odd_v3(n)}")

Python Output:

Testing alternative approach with multiple base cases:
is_even_v3(-3) = False, is_odd_v3(-3) = True
is_even_v3(-2) = True, is_odd_v3(-2) = False
is_even_v3(-1) = False, is_odd_v3(-1) = True
is_even_v3( 0) = True, is_odd_v3( 0) = False
is_even_v3( 1) = False, is_odd_v3( 1) = True
is_even_v3( 2) = True, is_odd_v3( 2) = False
is_even_v3( 3) = False, is_odd_v3( 3) = True

Comparing Approaches

Approach 1 (convert to absolute value at start): - Pros: Simple, handles negatives early - Cons: Extra check on every call

Approach 2 (multiple base cases): - Pros: More explicit base cases, clearer logic - Cons: More code, base cases in both functions

Both work correctly. Choose based on clarity preference.

Generalizing: Mutual Recursion for Modular Arithmetic

The even/odd pattern generalizes to checking divisibility by any number. Let's implement "divisible by 3" using mutual recursion:

# Generalization: Divisible by 3 using mutual recursion
def divisible_by_3(n):
    """
    Check if n is divisible by 3 using mutual recursion.
    Base cases: 0 is divisible by 3, 1 and 2 are not.
    """
    if n < 0:
        n = -n

    if n == 0:
        return True
    if n == 1 or n == 2:
        return False

    # n is divisible by 3 if n-3 is divisible by 3
    return divisible_by_3(n - 3)

def not_divisible_by_3(n):
    """
    Check if n is NOT divisible by 3 using mutual recursion.
    This is just the negation of divisible_by_3.
    """
    return not divisible_by_3(n)

# Test divisibility by 3
print("\nTesting divisibility by 3:")
for n in range(15):
    div3 = divisible_by_3(n)
    print(f"divisible_by_3({n:2}) = {div3}")

Python Output:

Testing divisibility by 3:
divisible_by_3( 0) = True
divisible_by_3( 1) = False
divisible_by_3( 2) = False
divisible_by_3( 3) = True
divisible_by_3( 4) = False
divisible_by_3( 5) = False
divisible_by_3( 6) = True
divisible_by_3( 7) = False
divisible_by_3( 8) = False
divisible_by_3( 9) = True
divisible_by_3(10) = False
divisible_by_3(11) = False
divisible_by_3(12) = True
divisible_by_3(13) = False
divisible_by_3(14) = False

The Pattern: Mutual Recursion for Cyclic Properties

The even/odd example demonstrates a general pattern: mutual recursion for cyclic properties.

When to use this pattern: - Property has cyclic nature (even/odd, divisibility) - States alternate in a predictable way - Each state can be defined in terms of the other

The structure:

def state_A(n):
    if base_case_A:
        return result_A
    return state_B(n - step)

def state_B(n):
    if base_case_B:
        return result_B
    return state_A(n - step)

Real-world applications: - Parity checking (even/odd) - State machines with two states - Alternating player turns in games - Ping-pong protocols

Complexity Analysis

Time Complexity: O(n) - Each call decrements n by 1 (or by step size) - Total calls: n / step - For even/odd: O(n) calls

Space Complexity: O(n) - Call stack depth: n / step - Each call stores constant data - For even/odd: O(n) stack frames

Why this is inefficient: For even/odd checking, n % 2 is O(1) time and O(1) space. The recursive approach is purely pedagogical—it demonstrates mutual recursion clearly but is not practical.

When mutual recursion IS practical: - Parsing (expression evaluation, tokenizing) - State machines (protocol handling) - Game AI (minimax with alternating players) - Grammar-based processing

In these cases, mutual recursion provides clarity and structure that outweighs the overhead.

Advanced: Recursive Descent Parsing Basics

Recursive Descent Parsing: The Professional Application

Now we'll apply everything we've learned to build a recursive descent parser—a real-world technique used in compilers and interpreters. This is where mutual recursion truly shines.

What is Recursive Descent Parsing?

Recursive descent parsing is a top-down parsing technique where: 1. Each grammar rule becomes a function 2. Functions call each other based on grammar structure 3. The parser "descends" through the grammar recursively

This is the technique used in many production compilers (Python's own parser uses a variant of this).

The Reference Problem: Boolean Expression Evaluator

Let's build an evaluator for boolean expressions with: - Literals: true, false - Operators: and, or, not - Parentheses: (, ) - Precedence: not > and > or

Valid expressions: - true and false → False - not true or false → False - (true or false) and true → True - not (true and false) → True

The Grammar

First, we define the grammar formally:

expression = or_expr
or_expr    = and_expr ('or' and_expr)*
and_expr   = not_expr ('and' not_expr)*
not_expr   = 'not' not_expr | primary
primary    = 'true' | 'false' | '(' expression ')'

Notice the structure: - Lower precedence (or) at the top - Higher precedence (and) called by lower - Highest precedence (not) called by higher - primary handles base cases and parentheses

This grammar naturally maps to mutually recursive functions.

Iteration 1: The Complete Parser

# Iteration 1: Boolean expression parser with recursive descent
class BooleanParser:
    def __init__(self, expr):
        self.tokens = self._tokenize(expr)
        self.index = 0

    def _tokenize(self, expr):
        """Convert expression string to list of tokens."""
        tokens = []
        i = 0
        while i < len(expr):
            # Skip whitespace
            if expr[i].isspace():
                i += 1
                continue

            # Multi-character tokens
            if expr[i:i+5] == 'false':
                tokens.append(('BOOL', False))
                i += 5
            elif expr[i:i+4] == 'true':
                tokens.append(('BOOL', True))
                i += 4
            elif expr[i:i+3] == 'and':
                tokens.append(('AND', 'and'))
                i += 3
            elif expr[i:i+3] == 'not':
                tokens.append(('NOT', 'not'))
                i += 3
            elif expr[i:i+2] == 'or':
                tokens.append(('OR', 'or'))
                i += 2
            elif expr[i] == '(':
                tokens.append(('LPAREN', '('))
                i += 1
            elif expr[i] == ')':
                tokens.append(('RPAREN', ')'))
                i += 1
            else:
                raise ValueError(f"Unexpected character: '{expr[i]}'")

        return tokens

    def parse(self):
        """Entry point: parse the entire expression."""
        result = self.parse_expression()
        if self.index < len(self.tokens):
            raise ValueError(f"Unexpected token: {self.tokens[self.index]}")
        return result

    def parse_expression(self):
        """
        Parse an expression (lowest precedence: or).
        Grammar: expression = or_expr
        """
        return self.parse_or_expr()

    def parse_or_expr(self):
        """
        Parse OR expression (lowest precedence).
        Grammar: or_expr = and_expr ('or' and_expr)*
        Mutual recursion: calls parse_and_expr()
        """
        # Get first operand (higher precedence)
        result = self.parse_and_expr()

        # Process any 'or' operators
        while self.index < len(self.tokens) and self.tokens[self.index][0] == 'OR':
            self.index += 1  # Skip 'or'
            right = self.parse_and_expr()
            result = result or right

        return result

    def parse_and_expr(self):
        """
        Parse AND expression (higher precedence).
        Grammar: and_expr = not_expr ('and' not_expr)*
        Mutual recursion: calls parse_not_expr()
        """
        # Get first operand (higher precedence)
        result = self.parse_not_expr()

        # Process any 'and' operators
        while self.index < len(self.tokens) and self.tokens[self.index][0] == 'AND':
            self.index += 1  # Skip 'and'
            right = self.parse_not_expr()
            result = result and right

        return result

    def parse_not_expr(self):
        """
        Parse NOT expression (highest precedence).
        Grammar: not_expr = 'not' not_expr | primary
        Mutual recursion: calls parse_primary() or itself
        """
        # Check for 'not' operator
        if self.index < len(self.tokens) and self.tokens[self.index][0] == 'NOT':
            self.index += 1  # Skip 'not'
            # Recursively parse the operand
            return not self.parse_not_expr()

        # Otherwise, parse primary
        return self.parse_primary()

    def parse_primary(self):
        """
        Parse primary expression (base case).
        Grammar: primary = 'true' | 'false' | '(' expression ')'
        Mutual recursion: calls parse_expression() for parentheses
        """
        if self.index >= len(self.tokens):
            raise ValueError("Unexpected end of expression")

        token_type, value = self.tokens[self.index]

        # Boolean literal
        if token_type == 'BOOL':
            self.index += 1
            return value

        # Parenthesized expression
        if token_type == 'LPAREN':
            self.index += 1  # Skip '('
            # Recursively parse the expression inside
            result = self.parse_expression()

            if self.index >= len(self.tokens) or self.tokens[self.index][0] != 'RPAREN':
                raise ValueError("Expected ')'")
            self.index += 1  # Skip ')'

            return result

        raise ValueError(f"Unexpected token: {token_type}")

def evaluate_boolean(expr):
    """Convenience function to parse and evaluate a boolean expression."""
    parser = BooleanParser(expr)
    return parser.parse()

# Test the boolean parser
print("Testing boolean expression parser:")
test_cases = [
    ("true", True),
    ("false", False),
    ("true and false", False),
    ("true or false", True),
    ("not true", False),
    ("not false", True),
    ("true and true and true", True),
    ("true or false and false", True),  # or has lower precedence
    ("(true or false) and false", False),  # parentheses override
    ("not true or false", False),  # not has highest precedence
    ("not (true or false)", False),
    ("true and not false", True),
]

for expr, expected in test_cases:
    result = evaluate_boolean(expr)
    status = "✓" if result == expected else "✗"
    print(f"{status} evaluate_boolean('{expr}') = {result} (expected {expected})")

Python Output:

Testing boolean expression parser:
✓ evaluate_boolean('true') = True (expected True)
✓ evaluate_boolean('false') = False (expected False)
✓ evaluate_boolean('true and false') = False (expected False)
✓ evaluate_boolean('true or false') = True (expected True)
✓ evaluate_boolean('not true') = False (expected False)
✓ evaluate_boolean('not false') = True (expected True)
✓ evaluate_boolean('true and true and true') = True (expected True)
✓ evaluate_boolean('true or false and false') = True (expected True)
✓ evaluate_boolean('(true or false) and false') = False (expected False)
✓ evaluate_boolean('not true or false') = False (expected False)
✓ evaluate_boolean('not (true or false)') = False (expected False)
✓ evaluate_boolean('true and not false') = True (expected True)

Perfect! Let's trace a complex expression to see the mutual recursion in action.

Tracing Execution: "not (true or false)"

Let's trace this expression step by step:

# Add tracing to visualize the parsing
class TracedBooleanParser(BooleanParser):
    def __init__(self, expr):
        super().__init__(expr)
        self.depth = 0

    def _indent(self):
        return "  " * self.depth

    def parse(self):
        print(f"\n{'='*60}")
        print(f"Parsing: '{' '.join(t[1] if isinstance(t[1], str) else str(t[1]) for t in self.tokens)}'")
        print(f"{'='*60}")
        result = self.parse_expression()
        print(f"\n{'='*60}")
        print(f"Final result: {result}")
        print(f"{'='*60}")
        return result

    def parse_expression(self):
        self.depth += 1
        print(f"{self._indent()}→ parse_expression()")
        result = self.parse_or_expr()
        print(f"{self._indent()}← parse_expression() returns {result}")
        self.depth -= 1
        return result

    def parse_or_expr(self):
        self.depth += 1
        print(f"{self._indent()}→ parse_or_expr()")
        result = self.parse_and_expr()
        print(f"{self._indent()}  Got and_expr: {result}")

        while self.index < len(self.tokens) and self.tokens[self.index][0] == 'OR':
            print(f"{self._indent()}  Found 'or'")
            self.index += 1
            right = self.parse_and_expr()
            print(f"{self._indent()}  Got and_expr: {right}")
            result = result or right
            print(f"{self._indent()}  Result: {result}")

        print(f"{self._indent()}← parse_or_expr() returns {result}")
        self.depth -= 1
        return result

    def parse_and_expr(self):
        self.depth += 1
        print(f"{self._indent()}→ parse_and_expr()")
        result = self.parse_not_expr()
        print(f"{self._indent()}  Got not_expr: {result}")

        while self.index < len(self.tokens) and self.tokens[self.index][0] == 'AND':
            print(f"{self._indent()}  Found 'and'")
            self.index += 1
            right = self.parse_not_expr()
            print(f"{self._indent()}  Got not_expr: {right}")
            result = result and right
            print(f"{self._indent()}  Result: {result}")

        print(f"{self._indent()}← parse_and_expr() returns {result}")
        self.depth -= 1
        return result

    def parse_not_expr(self):
        self.depth += 1
        print(f"{self._indent()}→ parse_not_expr()")

        if self.index < len(self.tokens) and self.tokens[self.index][0] == 'NOT':
            print(f"{self._indent()}  Found 'not'")
            self.index += 1
            result = not self.parse_not_expr()
            print(f"{self._indent()}← parse_not_expr() returns {result}")
            self.depth -= 1
            return result

        result = self.parse_primary()
        print(f"{self._indent()}← parse_not_expr() returns {result}")
        self.depth -= 1
        return result

    def parse_primary(self):
        self.depth += 1
        token_type, value = self.tokens[self.index]
        print(f"{self._indent()}→ parse_primary() sees {token_type}")

        if token_type == 'BOOL':
            self.index += 1
            print(f"{self._indent()}← parse_primary() returns {value}")
            self.depth -= 1
            return value

        if token_type == 'LPAREN':
            print(f"{self._indent()}  Found '(', parsing sub-expression")
            self.index += 1
            result = self.parse_expression()
            self.index += 1  # Skip ')'
            print(f"{self._indent()}← parse_primary() returns {result}")
            self.depth -= 1
            return result

        raise ValueError(f"Unexpected token: {token_type}")

# Trace a complex expression
parser = TracedBooleanParser("not (true or false)")
result = parser.parse()

Python Output:

============================================================
Parsing: 'not ( True or False )'
============================================================
  → parse_expression()
    → parse_or_expr()
      → parse_and_expr()
        → parse_not_expr()
          Found 'not'
          → parse_not_expr()
            → parse_primary() sees LPAREN
              Found '(', parsing sub-expression
              → parse_expression()
                → parse_or_expr()
                  → parse_and_expr()
                    → parse_not_expr()
                      → parse_primary() sees BOOL
                      ← parse_primary() returns True
                    ← parse_not_expr() returns True
                  ← parse_and_expr() returns True
                  Got and_expr: True
                  Found 'or'
                  → parse_and_expr()
                    → parse_not_expr()
                      → parse_primary() sees BOOL
                      ← parse_primary() returns False
                    ← parse_not_expr() returns False
                  ← parse_and_expr() returns False
                  Got and_expr: False
                  Result: True
                ← parse_or_expr() returns True
              ← parse_expression() returns True
            ← parse_primary() returns True
          ← parse_not_expr() returns True
        ← parse_not_expr() returns False
      ← parse_and_expr() returns False
    ← parse_or_expr() returns False
  ← parse_expression() returns False

============================================================
Final result: False
============================================================

Understanding the Mutual Recursion Flow

Let's parse this trace section by section:

1. Entry: parse_expression() → parse_or_expr() → parse_and_expr() → parse_not_expr()

2. NOT operator found: parse_not_expr() sees not and recursively calls itself

3. Parenthesis found: The recursive parse_not_expr() calls parse_primary(), which sees ( and calls parse_expression() to parse the sub-expression

4. Sub-expression parsing: Inside the parentheses, true or false is parsed: - parse_or_expr() gets first operand: true - Sees or operator - Gets second operand: false - Computes true or false = true

5. Return from parentheses: parse_primary() returns true to parse_not_expr()

6. Apply NOT: parse_not_expr() computes not true = false

7. Final result: false

The Mutual Recursion Cycle

Here's the call pattern for not (true or false):

parse_expression()
  ↓
parse_or_expr()
  ↓
parse_and_expr()
  ↓
parse_not_expr() [sees 'not']
  ↓
parse_not_expr() [recursive call]
  ↓
parse_primary() [sees '(']
  ↓
parse_expression() [for sub-expression]
  ↓
parse_or_expr() [processes 'true or false']
  ↓
parse_and_expr() [gets 'true']
  ↓
parse_not_expr()
  ↓
parse_primary() [returns 'true']
  ↑
[returns 'true']
  ↑
[sees 'or']
  ↓
parse_and_expr() [gets 'false']
  ↓
parse_not_expr()
  ↓
parse_primary() [returns 'false']
  ↑
[computes 'true or false' = 'true']
  ↑
[returns 'true' to parse_primary()]
  ↑
[returns 'true' to parse_not_expr()]
  ↑
[computes 'not true' = 'false']
  ↑
[returns 'false']

The key insight: The parser naturally handles precedence and parentheses through the mutual recursion structure. Lower precedence functions call higher precedence functions, and parentheses restart the cycle from the top.

Extending the Parser: Adding Variables

Let's extend our parser to support variables:

# Iteration 2: Extended parser with variables
class ExtendedBooleanParser(BooleanParser):
    def __init__(self, expr, variables=None):
        super().__init__(expr)
        self.variables = variables or {}

    def _tokenize(self, expr):
        """Override to handle variable names."""
        tokens = []
        i = 0
        while i < len(expr):
            if expr[i].isspace():
                i += 1
                continue

            # Check for keywords first
            if expr[i:i+5] == 'false':
                tokens.append(('BOOL', False))
                i += 5
            elif expr[i:i+4] == 'true':
                tokens.append(('BOOL', True))
                i += 4
            elif expr[i:i+3] == 'and':
                tokens.append(('AND', 'and'))
                i += 3
            elif expr[i:i+3] == 'not':
                tokens.append(('NOT', 'not'))
                i += 3
            elif expr[i:i+2] == 'or':
                tokens.append(('OR', 'or'))
                i += 2
            elif expr[i] == '(':
                tokens.append(('LPAREN', '('))
                i += 1
            elif expr[i] == ')':
                tokens.append(('RPAREN', ')'))
                i += 1
            # Handle variable names (letters)
            elif expr[i].isalpha():
                start = i
                while i < len(expr) and expr[i].isalnum():
                    i += 1
                var_name = expr[start:i]
                tokens.append(('VAR', var_name))
            else:
                raise ValueError(f"Unexpected character: '{expr[i]}'")

        return tokens

    def parse_primary(self):
        """Override to handle variables."""
        if self.index >= len(self.tokens):
            raise ValueError("Unexpected end of expression")

        token_type, value = self.tokens[self.index]

        if token_type == 'BOOL':
            self.index += 1
            return value

        # Handle variables
        if token_type == 'VAR':
            self.index += 1
            if value not in self.variables:
                raise ValueError(f"Undefined variable: {value}")
            return self.variables[value]

        if token_type == 'LPAREN':
            self.index += 1
            result = self.parse_expression()
            if self.index >= len(self.tokens) or self.tokens[self.index][0] != 'RPAREN':
                raise ValueError("Expected ')'")
            self.index += 1
            return result

        raise ValueError(f"Unexpected token: {token_type}")

def evaluate_with_vars(expr, variables):
    """Evaluate boolean expression with variables."""
    parser = ExtendedBooleanParser(expr, variables)
    return parser.parse()

# Test with variables
print("\nTesting parser with variables:")
variables = {'x': True, 'y': False, 'z': True}
test_cases = [
    ("x", True),
    ("y", False),
    ("x and y", False),
    ("x or y", True),
    ("not x", False),
    ("x and (y or z)", True),
    ("(x or y) and not z", False),
]

print(f"Variables: {variables}\n")
for expr, expected in test_cases:
    result = evaluate_with_vars(expr, variables)
    status = "✓" if result == expected else "✗"
    print(f"{status} evaluate('{expr}') = {result} (expected {expected})")

Python Output:

Testing parser with variables:
Variables: {'x': True, 'y': False, 'z': True}

✓ evaluate('x') = True (expected True)
✓ evaluate('y') = False (expected False)
✓ evaluate('x and y') = False (expected False)
✓ evaluate('x or y') = True (expected True)
✓ evaluate('not x') = False (expected False)
✓ evaluate('x and (y or z)') = True (expected True)
✓ evaluate('(x or y) and not z') = False (expected False)

Perfect! We've built a complete boolean expression evaluator with variables using recursive descent parsing.

The Journey: From Problem to Solution

Iteration	Feature Added	Technique Applied	Result
0	Basic arithmetic parser	Mutual recursion for precedence	Handles +, -, *, / correctly
1	Boolean expression parser	Recursive descent parsing	Handles and, or, not with precedence
2	Variable support	Extended parse_primary()	Evaluates expressions with variables

When to Apply Recursive Descent Parsing

What it optimizes for: - Natural mapping from grammar to code - Clear precedence handling - Easy to extend with new operators - Readable and maintainable

What it sacrifices: - Deeper call stack (one function per precedence level) - More code than table-driven parsers - Not suitable for ambiguous grammars

When to choose this approach: - Building compilers or interpreters - Parsing configuration files - Evaluating expressions - Implementing domain-specific languages (DSLs)

When to avoid this approach: - Grammar is ambiguous or left-recursive - Need maximum performance (use table-driven parser) - Grammar is very simple (regex might suffice)

Complexity characteristics: - Time complexity: O(n) where n is input length (each token processed once) - Space complexity: O(d) where d is maximum nesting depth (call stack) - Call stack depth: O(d) where d is maximum expression nesting

Real-World Applications

Recursive descent parsing is used in:

Programming language compilers: Python, Go, Rust use variants of this technique
Configuration parsers: YAML, TOML parsers
Query languages: SQL parsers, GraphQL parsers
Markup languages: Markdown parsers, HTML parsers
Domain-specific languages: Custom scripting languages

The pattern we've learned—mutual recursion following grammar structure—is the foundation of modern compiler design.

Synthesis and Mastery

The Complete Picture: Mutual Recursion Patterns

We've explored three major applications of mutual recursion:

1. Expression Evaluation (Precedence Handling)

Pattern:

def low_precedence():
    result = high_precedence()
    while operator_at_this_level():
        result = combine(result, high_precedence())
    return result

def high_precedence():
    # Either call even higher precedence, or parse base case
    pass

Key insight: Lower precedence calls higher precedence to get operands.

Applications: - Arithmetic expression evaluators - Boolean expression evaluators - Query language parsers

2. State Machine Tokenizing

Pattern:

def state_start():
    if condition_A:
        state_A()
    elif condition_B:
        state_B()
    # ...

def state_A():
    # Process state A
    state_start()  # Return to dispatcher

def state_B():
    # Process state B
    state_start()  # Return to dispatcher

Key insight: Each state is a function; state transitions are function calls.

Applications: - Lexical analysis (tokenizing) - Protocol parsing - Text processing with modes

3. Cyclic Property Checking

Pattern:

def property_A(n):
    if base_case_A:
        return result_A
    return property_B(n - step)

def property_B(n):
    if base_case_B:
        return result_B
    return property_A(n - step)

Key insight: Properties defined in terms of each other.

Applications: - Parity checking (even/odd) - Game state evaluation (alternating players) - Mutual dependencies

Decision Framework: When to Use Mutual Recursion

Choose Mutual Recursion When:

✅ Problem has natural hierarchical structure (grammar, precedence) ✅ States are interdependent (each defined in terms of others) ✅ Clear separation of concerns (each function handles one level/state) ✅ Extensibility matters (easy to add new levels/states) ✅ Readability is priority (matches conceptual model)

Avoid Mutual Recursion When:

❌ Simple iteration suffices (linear processing) ❌ Stack depth is concern (very deep nesting) ❌ Performance is critical (table-driven approach faster) ❌ Problem doesn't have clear states (forced abstraction)

Comparison with Alternatives

Approach	Pros	Cons	Best For
Mutual Recursion	Clear structure, matches grammar, extensible	Deeper stack, more code	Parsing, state machines
Single Recursive Function	Simpler, shallower stack	Tangled logic, hard to extend	Simple recursive problems
Iteration	Fastest, no stack overhead	Less clear for hierarchical problems	Linear processing
Table-Driven	Very fast, data-driven	Complex setup, less readable	Performance-critical parsing

Common Failure Modes and Their Signatures

Symptom: RecursionError with Mutual Recursion

Python output pattern:

RecursionError: maximum recursion depth exceeded

Diagnostic clues: - Call stack shows alternating function names - No base case reached - Parameters not progressing toward base case

Root cause: Missing or incorrect base case in one of the mutually recursive functions

Solution: Add base cases to all functions in the cycle

Symptom: Incorrect Precedence in Expression Parser

Python output pattern:

evaluate("2 + 3 * 4") = 20  # Expected: 14

Diagnostic clues: - Left-to-right evaluation instead of precedence - All operators treated equally

Root cause: Not using mutual recursion for precedence levels

Solution: Create separate functions for each precedence level, with lower precedence calling higher precedence

Symptom: Infinite Loop in State Machine

Python output pattern:

[Program hangs, no output]

Diagnostic clues: - State functions call each other without advancing input - No progress toward termination

Root cause: State transitions don't consume input or advance state

Solution: Ensure each state transition advances the input index or changes state meaningfully

Debugging Workflow: When Your Mutually Recursive Functions Fail

Step 1: Identify the Cycle

Question: Which functions call each other?

Action: Draw the call graph:

function_A → function_B → function_C → function_A

Step 2: Trace One Complete Cycle

Question: What happens in one full cycle through all functions?

Action: Add print statements at entry/exit of each function:

def function_A(n):
    print(f"→ function_A({n})")
    result = function_B(n - 1)
    print(f"← function_A({n}) returns {result}")
    return result

Step 3: Verify Base Cases

Question: Does each function have a base case?

Action: Check that every function in the cycle can terminate without calling another function in the cycle.

Step 4: Verify Progress Toward Base Case

Question: Do parameters move toward base case with each call?

Action: Trace parameter values through the cycle. Ensure they approach the base case condition.

Step 5: Check State Transitions

Question: For state machines, does each state advance the input?

Action: Verify that self.index or equivalent advances in each state function.

Step 6: Validate Grammar Structure

Question: For parsers, does the grammar match the code structure?

Action: Compare grammar rules to function structure. Each grammar rule should map to a function.

Complexity Analysis: The Cost of Mutual Recursion

Time Complexity

For most mutual recursion patterns: - Expression parsing: O(n) where n is input length (each token processed once) - State machine tokenizing: O(n) where n is input length (each character processed once) - Cyclic property checking: O(n) where n is the parameter value (decrements to base case)

Key insight: Time complexity is usually linear in input size, despite the mutual recursion.

Space Complexity

The critical factor is call stack depth: - Expression parsing: O(d) where d is maximum nesting depth (parentheses) - State machine tokenizing: O(s) where s is number of states (typically O(1) since states return quickly) - Cyclic property checking: O(n) where n is the parameter value (one call per decrement)

Key insight: Space complexity depends on maximum recursion depth, not total number of calls.

Recurrence Relations

For expression parsing with precedence:

T(n) = T(n/k) + O(1)  where k is number of precedence levels

This solves to O(n) by the Master Theorem.

For cyclic property checking:

T(n) = T(n-1) + O(1)

This solves to O(n) by telescoping.

Final Implementation: Production-Ready Boolean Parser

Here's a complete, production-ready boolean expression parser with all best practices:

# Production-ready boolean expression parser
class ProductionBooleanParser:
    """
    A robust boolean expression parser with:
    - Proper error messages
    - Variable support
    - Operator precedence (not > and > or)
    - Parentheses support
    - Input validation
    """

    def __init__(self, expr, variables=None):
        self.expr = expr
        self.variables = variables or {}
        self.tokens = []
        self.index = 0

    def parse(self):
        """Parse and evaluate the boolean expression."""
        try:
            self._tokenize()
            result = self._parse_expression()

            if self.index < len(self.tokens):
                token = self.tokens[self.index]
                raise ValueError(
                    f"Unexpected token '{token[1]}' at position {self._get_position(self.index)}"
                )

            return result

        except ValueError as e:
            raise ValueError(f"Parse error in '{self.expr}': {e}")

    def _tokenize(self):
        """Convert expression string to list of tokens."""
        self.tokens = []
        i = 0

        while i < len(self.expr):
            # Skip whitespace
            if self.expr[i].isspace():
                i += 1
                continue

            # Multi-character tokens (check longest first)
            if self.expr[i:i+5] == 'false':
                self.tokens.append(('BOOL', False, i))
                i += 5
            elif self.expr[i:i+4] == 'true':
                self.tokens.append(('BOOL', True, i))
                i += 4
            elif self.expr[i:i+3] == 'and':
                self.tokens.append(('AND', 'and', i))
                i += 3
            elif self.expr[i:i+3] == 'not':
                self.tokens.append(('NOT', 'not', i))
                i += 3
            elif self.expr[i:i+2] == 'or':
                self.tokens.append(('OR', 'or', i))
                i += 2
            elif self.expr[i] == '(':
                self.tokens.append(('LPAREN', '(', i))
                i += 1
            elif self.expr[i] == ')':
                self.tokens.append(('RPAREN', ')', i))
                i += 1
            # Variable names
            elif self.expr[i].isalpha():
                start = i
                while i < len(self.expr) and self.expr[i].isalnum():
                    i += 1
                var_name = self.expr[start:i]
                self.tokens.append(('VAR', var_name, start))
            else:
                raise ValueError(
                    f"Unexpected character '{self.expr[i]}' at position {i}"
                )

    def _get_position(self, token_index):
        """Get the position in the original expression for error messages."""
        if token_index < len(self.tokens):
            return self.tokens[token_index][2]
        return len(self.expr)

    def _parse_expression(self):
        """Parse expression (entry point)."""
        return self._parse_or_expr()

    def _parse_or_expr(self):
        """Parse OR expression (lowest precedence)."""
        result = self._parse_and_expr()

        while self.index < len(self.tokens) and self.tokens[self.index][0] == 'OR':
            self.index += 1
            right = self._parse_and_expr()
            result = result or right

        return result

    def _parse_and_expr(self):
        """Parse AND expression (higher precedence)."""
        result = self._parse_not_expr()

        while self.index < len(self.tokens) and self.tokens[self.index][0] == 'AND':
            self.index += 1
            right = self._parse_not_expr()
            result = result and right

        return result

    def _parse_not_expr(self):
        """Parse NOT expression (highest precedence)."""
        if self.index < len(self.tokens) and self.tokens[self.index][0] == 'NOT':
            self.index += 1
            return not self._parse_not_expr()

        return self._parse_primary()

    def _parse_primary(self):
        """Parse primary expression (base case)."""
        if self.index >= len(self.tokens):
            raise ValueError("Unexpected end of expression")

        token_type, value, pos = self.tokens[self.index]

        # Boolean literal
        if token_type == 'BOOL':
            self.index += 1
            return value

        # Variable
        if token_type == 'VAR':
            self.index += 1
            if value not in self.variables:
                raise ValueError(
                    f"Undefined variable '{value}' at position {pos}"
                )
            return self.variables[value]

        # Parenthesized expression
        if token_type == 'LPAREN':
            self.index += 1
            result = self._parse_expression()

            if self.index >= len(self.tokens) or self.tokens[self.index][0] != 'RPAREN':
                raise ValueError(
                    f"Expected ')' at position {self._get_position(self.index)}"
                )
            self.index += 1
            return result

        raise ValueError(
            f"Expected boolean, variable, or '(' at position {pos}, got '{value}'"
        )

def evaluate_boolean_expr(expr, variables=None):
    """
    Evaluate a boolean expression with optional variables.

    Args:
        expr: Boolean expression string
        variables: Dictionary of variable names to boolean values

    Returns:
        Boolean result of the expression

    Raises:
        ValueError: If expression is invalid or variables are undefined
    """
    parser = ProductionBooleanParser(expr, variables)
    return parser.parse()

# Comprehensive test suite
print("\nProduction parser test suite:")
print("="*60)

# Test basic operations
print("\n1. Basic operations:")
tests = [
    ("true", {}, True),
    ("false", {}, False),
    ("not true", {}, False),
    ("true and false", {}, False),
    ("true or false", {}, True),
]
for expr, vars, expected in tests:
    result = evaluate_boolean_expr(expr, vars)
    status = "✓" if result == expected else "✗"
    print(f"{status} {expr:20} = {result}")

# Test precedence
print("\n2. Operator precedence:")
tests = [
    ("not true or false", {}, False),  # not has highest precedence
    ("true or false and false", {}, True),  # and before or
    ("not true and false", {}, False),  # not before and
]
for expr, vars, expected in tests:
    result = evaluate_boolean_expr(expr, vars)
    status = "✓" if result == expected else "✗"
    print(f"{status} {expr:30} = {result}")

# Test parentheses
print("\n3. Parentheses:")
tests = [
    ("(true or false) and false", {}, False),
    ("not (true and false)", {}, True),
    ("(not true) or false", {}, False),
]
for expr, vars, expected in tests:
    result = evaluate_boolean_expr(expr, vars)
    status = "✓" if result == expected else "✗"
    print(f"{status} {expr:30} = {result}")

# Test variables
print("\n4. Variables:")
vars = {'x': True, 'y': False, 'z': True}
tests = [
    ("x and y", vars, False),
    ("x or y", vars, True),
    ("not x", vars, False),
    ("x and (y or z)", vars, True),
]
print(f"Variables: {vars}")
for expr, vars, expected in tests:
    result = evaluate_boolean_expr(expr, vars)
    status = "✓" if result == expected else "✗"
    print(f"{status} {expr:30} = {result}")

# Test error handling
print("\n5. Error handling:")
error_cases = [
    ("true and", {}),
    ("(true or false", {}),
    ("true false", {}),
    ("x and y", {}),  # undefined variables
]
for expr, vars in error_cases:
    try:
        result = evaluate_boolean_expr(expr, vars)
        print(f"✗ {expr:30} should have raised error")
    except ValueError as e:
        print(f"✓ {expr:30} raised: {str(e)[:40]}...")

print("\n" + "="*60)

Python Output:

Production parser test suite:
============================================================

1. Basic operations:
✓ true                 = True
✓ false                = False
✓ not true             = False
✓ true and false       = False
✓ true or false        = True

2. Operator precedence:
✓ not true or false            = False
✓ true or false and false      = True
✓ not true and false           = False

3. Parentheses:
✓ (true or false) and false    = False
✓ not (true and false)         = True
✓ (not true) or false          = False

4. Variables:
Variables: {'x': True, 'y': False, 'z': True}
✓ x and y                      = False
✓ x or y                       = True
✓ not x                        = False
✓ x and (y or z)               = True

5. Error handling:
✓ true and                     raised: Parse error in 'true and': Unexpec...
✓ (true or false               raised: Parse error in '(true or false': E...
✓ true false                   raised: Parse error in 'true false': Unexp...
✓ x and y                      raised: Parse error in 'x and y': Undefine...

============================================================

Lessons Learned: The Essence of Mutual Recursion

1. Structure Follows Grammar

When your problem has a formal grammar or hierarchical structure, mutual recursion provides a natural mapping from specification to implementation. Each grammar rule becomes a function.

2. Separation of Concerns

Mutual recursion allows you to separate different aspects of a problem into different functions. Each function handles one level of precedence, one state, or one type of entity.

3. Extensibility Through Modularity

Adding new features (operators, states, token types) is straightforward: add a new function and update the dispatcher. The existing functions remain unchanged.

4. The Cost of Clarity

Mutual recursion trades stack space for code clarity. The call stack grows deeper, but the code structure matches the problem structure, making it easier to understand and maintain.

5. Base Cases Are Critical

In mutual recursion, every function in the cycle must have a path to termination. Missing base cases in any function can cause infinite recursion.

6. Debugging Requires Systematic Tracing

When debugging mutually recursive functions, trace the complete cycle. Understand how functions call each other and how parameters change with each call.

7. Not Always the Best Tool

Mutual recursion is powerful for specific problem types (parsing, state machines), but simple iteration is often clearer for linear processing. Choose the right tool for the problem.

Mastery Checklist

You've mastered mutual recursion when you can:

✅ Identify when a problem naturally maps to mutual recursion
✅ Design mutually recursive functions from a grammar specification
✅ Implement state machines using mutual recursion
✅ Handle operator precedence through function hierarchy
✅ Debug mutually recursive functions systematically
✅ Analyze time and space complexity of mutual recursion
✅ Recognize when mutual recursion is overkill
✅ Extend mutually recursive systems with new features
✅ Write production-ready parsers using recursive descent
✅ Explain the trade-offs between mutual recursion and alternatives

Congratulations! You now understand one of the most sophisticated applications of recursion: mutual recursion for parsing and state machines. This knowledge forms the foundation of compiler design, interpreter implementation, and many other advanced programming techniques.